Goto

Collaborating Authors

 randomness deficiency


Toward Universal Laws of Outlier Propagation

arXiv.org Artificial Intelligence

We argue that Algorithmic Information Theory (AIT) admits a principled way to quantify outliers in terms of so-called randomness deficiency. For the probability distribution generated by a causal Bayesian network, we show that the randomness deficiency of the joint state decomposes into randomness deficiencies of each causal mechanism, subject to the Independence of Mechanisms Principle. Accordingly, anomalous joint observations can be quantitatively attributed to their root causes, i.e., the mechanisms that behaved anomalously. As an extension of Levin's law of randomness conservation, we show that weak outliers cannot cause strong ones when Independence of Mechanisms holds. We show how these information theoretic laws provide a better understanding of the behaviour of outliers defined with respect to existing scores.


Algorithmic learning of probability distributions from random data in the limit

arXiv.org Machine Learning

We study the problem of identifying a probability distribution for some given randomly sampled data in the limit, in the context of algorithmic learning theory as proposed recently by Vinanyi and Chater. We show that there exists a computable partial learner for the computable probability measures, while by Bienvenu, Monin and Shen it is known that there is no computable learner for the computable probability measures. Our main result is the characterization of the oracles that compute explanatory learners for the computable (continuous) probability measures as the high oracles. This provides an analogue of a well-known result of Adleman and Blum in the context of learning computable probability distributions. We also discuss related learning notions such as behaviorally correct learning and orther variations of explanatory learning, in the context of learning probability distributions from data.


Approximation of the Two-Part MDL Code

arXiv.org Artificial Intelligence

In machine learning pure applications of MDL are rare, partially because of the difficulties one encounters trying to define an adequate model code and data-to-model code, and partially because of the operational difficulties that are poorly understood. We analyze aspects of both the power and the perils of MDL precisely and formally. Let us first resurrect a familiar problem from our childhood to illustrate some of the issues involved. The process of solving a jigsaw puzzle involves an incremental reduction of entropy, and this serves to illustrate the analogous features of the learning problems which are the main issues of this work. Initially, when the pieces come out of the box they have a completely random ordering. Gradually we combine pieces, thus reducing the entropy and increasing the order until the puzzle is solved. In this last stage we have found a maximal ordering. Suppose that Alice and Bob both start to solve two versions of the same puzzle, but that they follow different strategies. Initially, Alice sorts all pieces according to color, and Bob starts by sorting the pieces according to shape.